Identification of Fertile Translations in Medical Comparable Corpora: a Morpho-Compositional Approach
نویسندگان
چکیده
This paper defines a method for lexicon in the biomedical domain from comparable corpora. The method is based on compositional translation and exploits morpheme-level translation equivalences. It can generate translations for a large variety of morphologically constructed words and can also generate ’fertile’ translations. We show that fertile translations increase the overall quality of the extracted lexicon for English to French translation.
منابع مشابه
Revising the Compositional Method for Terminology Acquisition from Comparable Corpora
In this paper, we present a new method that improves the alignment of equivalent terms monolingually acquired from bilingual comparable corpora: the Compositional Method with Context-Based Projection (CMCBP). Our overall objective is to identify and to translate high specialized terminology made up of multi-word terms acquired from comparable corpora. Our evaluation in the medical domain and fo...
متن کاملUtilizing Citations of Foreign Words in Corpus-Based Dictionary Generation
Previous work concerned with the identification of word translations from text collections has been either based on parallel or on comparable corpora of the respective languages. In the case of comparable corpora basic dictionaries have been necessary to form a bridge between the languages under consideration. We present here a novel approach to identify word translations from a single monoling...
متن کاملCombining String and Context Similarity for Bilingual Term Alignment from Comparable Corpora
Automatically compiling bilingual dictionaries of technical terms from comparable corpora is a challenging problem, yet with many potential applications. In this paper, we exploit two independent observations about term translations: (a) terms are often formed by corresponding sub-lexical units across languages and (b) a term and its translation tend to appear in similar lexical context. Based ...
متن کاملImproving Compositional Translation with Comparable Corpora
We improved the compositional term translation method by using comparable corpora. A bilingual lexicon consisting of pairs of word sequences within terms and their correlations is derived from a bilingual document-aligned corpus. Then, for an input term, compositional translations are produced together with their confidence scores by consulting the corpus-derived bilingual lexicon. Thus, we can...
متن کاملMining New Word Translations from Comparable Corpora
New words such as names, technical terms, etc appear frequently. As such, the bilingual lexicon of a machine translation system has to be constantly updated with these new word translations. Comparable corpora such as news documents of the same period from different news agencies are readily available. In this paper, we present a new approach to mining new word translations from comparable corp...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1209.2400 شماره
صفحات -
تاریخ انتشار 2012